Pause As A Phrase Demarcator For Speech And Language Processing
Abstract
In spontaneous speech understanding a sophisticated integration of speech recognition and language processing is especially crucial. However, the two modules are traditionally designed independently, with independent linguistic rules. In Japanese speech recognition the bunsetsu phrase is the basic processing unit and in language processing the sentence is the basic unit. This difference has made it impractical to use a unique set of linguistic rules for both types of processing. Further, spontaneous speech contains unexpected utterances other than well-formed sentences, while linguistic rules for both speech and language processing expect well-formed sentences. They therefore fail to process everyday spoken language. To bridge the gap between speech and language processing, we propose that pauses be treated as phrase demarcators and that the interpausal phrase be the basic common processing unit. And to treat the linguistic phenomena of spoken language properly, we survey relevant features in spontaneous speech data. We then examine the effect of integrating pausal and spontaneous speech phenomena into syntactic rules for speech recognition, using 118 sentences. Our experiments show that incorporating pausal phenomena as purely syntactic constraints degrades recognition accuracy considerably, while the additional degradation is minor if some further spontaneous speech features are also incorporated.

1 INTRODUCTION

A spontaneous speech understanding system accepts naturally spoken input and understands its meaning. In such a system, speech processing and language processing must be integrated in a sophisticated manner. However, the integration is not straightforward, as the two are studied independently and have different processing units. Moreover, spontaneous speech contains unexpected phenomena, such as hesitations, corrections and fragmentary expressions, which thus far have not been treated in linguistic rules.
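As an illustration of the interpausal phrase as a processing unit, the following is a minimal sketch in Python of cutting a time-aligned word sequence wherever a silence occurs. It is not the authors' implementation: the `Word` record, the `interpausal_phrases` function, the 0.2-second threshold and the word timings are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold: the source does not say how long a silence
# must be before it counts as a pause.
PAUSE_THRESHOLD_SEC = 0.2


@dataclass
class Word:
    """A recognized word with start and end times in seconds (assumed format)."""
    text: str
    start: float
    end: float


def interpausal_phrases(words: List[Word],
                        threshold: float = PAUSE_THRESHOLD_SEC) -> List[List[Word]]:
    """Group words into phrases, starting a new phrase after each pause."""
    phrases: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        # A silence of at least `threshold` before this word closes the phrase.
        if current and word.start - current[-1].end >= threshold:
            phrases.append(current)
            current = []
        current.append(word)
    if current:
        phrases.append(current)
    return phrases


def phrase_texts(phrases: List[List[Word]]) -> List[List[str]]:
    """Drop the timings and keep only the word strings of each phrase."""
    return [[w.text for w in phrase] for phrase in phrases]


# Invented timings for an example utterance; the pause position is illustrative only.
utterance = [
    Word("kaigi", 0.0, 0.4),
    Word("ni", 0.4, 0.5),
    Word("moshikomi", 0.9, 1.4),  # preceded by 0.4 s of silence
    Word("tai", 1.4, 1.6),
    Word("no", 1.6, 1.7),
    Word("desu", 1.7, 1.9),
    Word("ga", 1.9, 2.0),
]

if __name__ == "__main__":
    for phrase in phrase_texts(interpausal_phrases(utterance)):
        print(" ".join(phrase))
```

With these invented timings the utterance splits into two interpausal phrases, one before and one after the silence. Under the paper's proposal, such units would be shared by the speech recognizer and the language processor; the sketch covers only the segmentation step.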
The most significant concern in speech processing is raising the recognition accuracy. For that purpose, applying linguistic information, e.g. using stochastic models[1], syntactic rules[2], semantic information[3] and discourse plans[4], is most promising. In a recent Japanese speech translation system[5] bunsetsu-based syntactic constraints are successfully applied in the speech processing module[6]¹. However, rules representing the same constraints cannot be used directly in sentence-based language processing, where the primary concern is to understand sentence meaning. In speech recognition, a sequence of words forms a bunsetsu and a set of bunsetsus then forms a sentence. In language processing, on the other hand, where the sentence is the basic processing unit, treating the main verb and its complements is usually the core of processing. For the sentence kaigi ni moshikomi tai no desu ga, meaning 'I would like to apply for the conference,' the processing discrepancy is sketched in Figure 1.

¹ A bunsetsu roughly corresponds to a phrase and is the next largest unit after the word. The number of words in a phrase ranges from 1 to 14, and the mean number is about 3[7].
Similar resources
The effects of filled pauses on native and non-native listeners’ speech processing
Everyday speech is abundant with disfluencies. However, little is known about their roles in speech communication. We examined the effects of filled pauses at phrase boundaries on native and non-native listeners in Japanese. Study of spontaneous speech corpus showed that filled pauses tended to precede relatively long and complex constituents. We tested the hypothesis that filled pauses biased ...
Speech Planning and Prosodic Phrase Length
A synchronous speech study investigates effects on pause duration of prosodic phrases of different length. The goal is to examine local and distant effects of prosodic phrase length on pause duration. Subjects read 24 English sentences varying along the parameters: a) length in syllables (long or short) of the intonation phrase immediately following a target pause and b) length in syllables (lo...
Electrophysiological correlates of intonational phrase perception in infancy and childhood: How necessary is syntactic knowledge?
In the course of acquiring their first language, infants decompose continuous speech into relevant structural units such as clauses and words. Gleitman & Wanner (1982) have suggested, within the framework of prosodic bootstrapping, that infants approach this segmentation problem by exploiting acoustic information in the speech signal which correlates with the occurrence of syntactically signifi...
Semi-Supervised Learning of Acoustic Driven Prosodic Phrase Breaks for Text-to-Speech Systems
In this paper, we propose a semi-supervised learning of acoustic driven phrase breaks and its usefulness for text-to-speech systems. In this work, we derive a set of initial hypothesis of phrase breaks in a speech signal using pause as an acoustic cue. As these initial estimates are obtained based on knowledge of speech production and speech signal processing, one could treat the hypothesized p...
Publication date: 1994